With the increased focus on visual attention (VA) in the last decade, a large number of computational visual saliency methods have been developed over the past few years. These models are traditionally evaluated using performance evaluation metrics that quantify the match between predicted saliency and fixation data obtained from eye-tracking experiments on human observers. Though a considerable number of such metrics have been proposed in the literature, they have notable shortcomings. In this work, we discuss the shortcomings of existing metrics through illustrative examples and propose a new metric that uses local weights based on fixation density to overcome these flaws. To compare the performance of our proposed metric at assessing the quality of saliency prediction with that of existing metrics, we construct a ground-truth subjective database in which saliency maps obtained from 17 different VA models are evaluated by 16 human observers on a 5-point categorical scale in terms of their visual resemblance to the corresponding ground-truth fixation density maps obtained from eye-tracking data. The metrics are evaluated by correlating metric scores with the human subjective ratings. The correlation results show that the proposed evaluation metric outperforms all other popular existing metrics. Additionally, the constructed database and corresponding subjective ratings provide insight into which existing and future metrics are better at estimating the quality of saliency prediction, and can be used as a benchmark.
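The evaluation step described above, correlating metric scores with human subjective ratings, can be illustrated with a minimal sketch. The abstract does not state which correlation coefficient is used, so Spearman rank correlation is an assumption here, and the function names and all numbers below are hypothetical placeholders rather than values from the database.

```python
from math import sqrt


def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson linear correlation between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical data: one entry per saliency map.
# mean_ratings: observer ratings on the 5-point scale, averaged per map.
# metric_scores: scores assigned to the same maps by some saliency metric.
mean_ratings = [4.5, 3.0, 1.5, 2.0, 5.0]
metric_scores = [0.81, 0.35, 0.22, 0.60, 0.90]

rho = spearman(metric_scores, mean_ratings)
print(f"Spearman correlation with subjective ratings: {rho:.3f}")
```

Under this reading, a metric whose scores rank the saliency maps in nearly the same order as the observers did obtains a correlation close to 1, and candidate metrics can be compared by that value.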